NSF PAR Search | NSF Public Access Repository


cache_ext: Customizing the Page Cache with eBPF

https://doi.org/10.1145/3731569.3764820

Zussman, Tal; Zarkadas, Ioannis; Carin, Jeremy; Cheng, Andrew; Franke, Hubertus; Pfefferle, Jonas; Cidon, Asaf (October 2025, ACM)

The OS page cache is central to the performance of many applications, by reducing excessive accesses to storage. However, its one-size-fits-all eviction policy performs poorly in many workloads. While the systems community has experimented with a plethora of new and adaptive eviction policies in non-OS settings (e.g., key-value stores, CDNs), it is very difficult to implement such policies in the page cache, due to the complexity of modifying kernel code. To address these shortcomings, we design a flexible eBPF-based framework for the Linux page cache, called cache_ext, that allows developers to customize the page cache without modifying the kernel. cache_ext enables applications to customize the page cache policy for their specific needs, while also ensuring that different applications’ policies do not interfere with each other and preserving the page cache’s ability to share memory across different processes. We demonstrate the flexibility of cache_ext’s interface by using it to implement eight different policies, including sophisticated eviction algorithms. Our evaluation shows that it is indeed beneficial for applications to customize the page cache to match their workloads’ unique properties, and that they can achieve up to 70% higher throughput and 58% lower tail latency.
Free, publicly-accessible full text available October 12, 2026
Concord: Rethinking Distributed Coherence for Software Caches in Serverless Environments

https://doi.org/10.1109/HPCA61900.2025.00043

Stojkovic, Jovan; Alverti, Chloe; Andrade, Alan; Iliakopoulou, Nikoleta; Franke, Hubertus; Xu, Tianyin; Torrellas, Josep (March 2025, IEEE)

Free, publicly-accessible full text available March 1, 2026
EcoFaaS: Rethinking the Design of Serverless Environments for Energy Efficiency

https://doi.org/10.1109/ISCA59077.2024.00042

Stojkovic, Jovan; Iliakopoulou, Nikoleta; Xu, Tianyin; Franke, Hubertus; Torrellas, Josep (June 2024, IEEE)

Full Text Available
MXFaaS: Resource Sharing in Serverless Environments for Parallelism and Efficiency

https://doi.org/10.1145/3579371.3589069

Stojkovic, Jovan; Xu, Tianyin; Franke, Hubertus; Torrellas, Josep (June 2023, International Symposium on Computer Architecture (ISCA))
IEEE (Ed.)
Full Text Available
Remote attestation of confidential VMs using ephemeral vTPMs

https://doi.org/10.1145/3627106.3627112

Narayanan, Vikram; Carvalho, Claudio; Ruocco, Angelo; Almasi, Gheorghe; Bottomley, James; Ye, Mengmei; Feldman-Fitzthum, Tobin; Buono, Daniele; Franke, Hubertus; Burtsev, Anton (December 2023, ACM)

Full Text Available
AWARE: Automate Workload Autoscaling with Reinforcement Learning in Production Cloud Systems

Qiu, Haoran; Mao, Weichao; Wang, Chen; Franke, Hubertus; Yousseff, Alaa; Kalbarczyk, Zbigniew T.; Iyer, Ravishankar K (July 2023, 2023 USENIX Annual Technical Conference (USENIX ATC 23))

Workload autoscaling is widely used in public and private cloud systems to maintain stable service performance and save resources. However, it remains challenging to set the optimal resource limits and dynamically scale each workload at runtime. Reinforcement learning (RL) has recently been proposed and applied in various systems tasks, including resource management. In this paper, we first characterize the state-of-the-art RL approaches for workload autoscaling in a public cloud and point out that there is still a large gap in taking the RL advances to production systems. We then propose AWARE, an extensible framework for deploying and managing RL-based agents in production systems. AWARE leverages meta-learning and bootstrapping to (a) automatically and quickly adapt to different workloads, and (b) provide safe and robust RL exploration. AWARE provides a common OpenAI Gym-like RL interface to agent developers for easy integration with different systems tasks. We illustrate the use of AWARE in the case of workload autoscaling. Our experiments show that AWARE adapts a learned autoscaling policy to new workloads 5.5x faster than the existing transfer-learning-based approach and provides stable online policy-serving performance with less than 3.6% reward degradation. With bootstrapping, AWARE helps achieve 47.5% and 39.2% higher CPU and memory utilization while reducing SLO violations by a factor of 16.9x during policy training.
Full Text Available
SpecFaaS: Accelerating Serverless Applications with Speculative Function Execution

https://doi.org/10.1109/HPCA56546.2023.10071120

Stojkovic, Jovan; Xu, Tianyin; Franke, Hubertus; Torrellas, Josep (February 2023, IEEE)

Full Text Available
BPF-oF: Storage Function Pushdown Over the Network

Zarkadas, Ioannis; Zussman, Tal; Carin, Jeremy; Jiang, Sheng; Zhong, Yuhong; Pfefferle, Jonas; Franke, Hubertus; Yang, Junfeng; Kaffes, Kostis; Stutsman, Ryan; et al. (September 2023, arXiv)

Full Text Available
SIMPPO: a scalable and incremental online learning framework for serverless resource management

https://doi.org/10.1145/3542929.3563475

Qiu, Haoran; Mao, Weichao; Patke, Archit; Wang, Chen; Franke, Hubertus; Kalbarczyk, Zbigniew T.; Başar, Tamer; Iyer, Ravishankar K. (November 2022, Proceedings of the 13th ACM Symposium on Cloud Computing (SoCC 2022))

Serverless Function-as-a-Service (FaaS) offers improved programmability for customers, yet it is not server-“less” and comes at the cost of more complex infrastructure management (e.g., resource provisioning and scheduling) for cloud providers. To maintain function service-level objectives (SLOs) and improve resource utilization efficiency, recent research has been focused on applying online learning algorithms such as reinforcement learning (RL) to manage resources. Compared to rule-based solutions with heuristics, RL-based approaches eliminate humans in the loop and avoid the painstaking generation of heuristics. Despite the initial success of applying RL, we first show in this paper that the state-of-the-art single-agent RL algorithm (S-RL) suffers up to 4.8x higher p99 function latency degradation on multi-tenant serverless FaaS platforms compared to isolated environments and is unable to converge during training. We then design and implement a scalable and incremental multi-agent RL framework based on Proximal Policy Optimization (SIMPPO). Our experiments on widely used serverless benchmarks demonstrate that in multi-tenant environments, SIMPPO enables each RL agent to efficiently converge during training and provides online function latency performance comparable to that of S-RL trained in isolation (which we refer to as the baseline for assessing RL performance) with minor degradation (<9.2%). In addition, SIMPPO reduces the p99 function latency by 4.5x compared to S-RL in multi-tenant cases.
Full Text Available
Reinforcement learning for resource management in multi-tenant serverless platforms

https://doi.org/10.1145/3517207.3526971

Qiu, Haoran; Mao, Weichao; Patke, Archit; Wang, Chen; Franke, Hubertus; Kalbarczyk, Zbigniew T.; Başar, Tamer; Iyer, Ravishankar K. (April 2022, EuroMLSys 2022 - Proceedings of the 2nd European Workshop on Machine Learning and Systems)

Serverless Function-as-a-Service (FaaS) is an emerging cloud computing paradigm that frees application developers from infrastructure management tasks such as resource provisioning and scaling. To reduce the tail latency of functions and improve resource utilization, recent research has been focused on applying online learning algorithms such as reinforcement learning (RL) to manage resources. Compared to existing heuristics-based resource management approaches, RL-based approaches eliminate humans in the loop and avoid the painstaking generation of heuristics. In this paper, we show that the state-of-the-art single-agent RL algorithm (S-RL) suffers up to 4.6x higher function tail latency degradation on multi-tenant serverless FaaS platforms and is unable to converge during training. We then propose and implement a customized multi-agent RL algorithm based on Proximal Policy Optimization, i.e., multi-agent PPO (MA-PPO). We show that in multi-tenant environments, MA-PPO enables each agent to be trained until convergence and provides online performance comparable to S-RL in single-tenant cases with less than 10% degradation. Besides, MA-PPO provides a 4.4x improvement in S-RL performance (in terms of function tail latency) in multi-tenant cases.
Full Text Available
